System and method for utilizing audio encoding for measuring media exposure with environmental masking

ABSTRACT

An audio beacon system, apparatus and method for collecting information on a panelist's exposure to media. An audio beacon is configured as on-device encoding technology that is operative in a processing device (e.g., cell phone, PDA, PC) to enable the device to encode an environmental sound and transmit it for a predetermined period of time. The acoustically transmitted data is received and processed by a portable audience measurement device, such as Arbitron's Personal People Meter™ ("PPM"), or other specially equipped portable device to enable audience measurement systems to achieve higher levels of detail on panel member activity and greater association of measurement devices to their respective users.

RELATED APPLICATIONS

The present application is a continuation-in-part of U.S. patent application Ser. No. 12/425,464, titled "System and Method for Utilizing Audio Beaconing in Audience Measurement," filed Apr. 17, 2009, and U.S. patent application Ser. No. 12/425,556, titled "System and Method for Utilizing Supplemental Audio Beaconing in Audience Measurement," also filed on Apr. 17, 2009. Both applications are assigned to the assignee of the present application and are incorporated by reference in their entireties herein.

TECHNICAL FIELD

The present disclosure relates to systems and processes for communicating and processing data and, more specifically, to communicating media exposure data that may include coding for media and/or market research.

BACKGROUND INFORMATION

The use of global distribution systems such as the Internet for distribution of digital assets such as music, film, computer programs, pictures, games and other content continues to grow. In many instances, media offered via traditional broadcast mediums is supplemented by similar media offerings delivered through computer networks and the Internet. It is estimated that Internet-related media offerings will rival and even surpass traditional broadcast offerings in the coming years.

Techniques such as "watermarking" have been known in the art for incorporating information signals into media signals or executable code. Typical watermarks may include encoded indications of authorship, content, lineage, existence of copyright, or the like. Alternatively, other information may be incorporated into audio signals, either concerning the signal itself or unrelated to it. The information may be incorporated in an audio signal for various purposes, such as identification or as an address or command, whether or not related to the signal itself.

There is considerable interest in encoding audio signals with information to produce encoded audio signals having substantially the same perceptible characteristics as the original unencoded audio signals. Recent successful techniques exploit the psychoacoustic masking effect of the human auditory system, whereby certain sounds are humanly imperceptible when received along with other sounds.

Arbitron has developed a new and innovative technology called Critical Band Encoding Technology (CBET) that encompasses all forms of audio and video broadcasts in the measurement of audience participation. This technology dramatically increases both the accuracy of the measurement and the quantity of useable and effective data across all types of signal broadcasts. CBET is an encoding technique that Arbitron developed and that embeds identifying information (an ID code) or other information within the audio portion of a broadcast. The ID code is broadcast within the actual audio signal of the program, in a manner that makes it inaudible, to all locations where the program is broadcast, for example, a car radio, home stereo, computer network, television, etc. This embedded audio signal or ID code is then picked up by small (pager-size), specially designed receiving stations called Portable People Meters (PPMs), which capture the encoded identifying signal and store the information, along with a time stamp, in memory for retrieval at a later time. A microphone contained within the PPM receives the audio signal, which contains within it the ID code.

Further disclosures related to CBET encoding may be found in U.S. Pat. No. 5,450,490 and U.S. Pat. No. 5,764,763 (Jensen et al.), in which information is represented by a multiple-frequency code signal which is incorporated into an audio signal based upon the masking ability of the audio signal. Additional examples include U.S. Pat. No. 6,871,180 (Neuhauser et al.) and U.S. Pat. No. 6,845,360 (Jensen et al.), where numerous messages represented by multiple-frequency code signals are incorporated to produce an encoded audio signal. Other examples include U.S. Pat. No. 7,239,981 (Kolessar et al.). Each of the above-mentioned patents is incorporated by reference in its entirety herein.

The encoded audio signal described above is suitable for broadcast transmission and reception and may be adapted for Internet transmission, reception, recording and reproduction. When received, the audio signal is processed to detect the presence of the multiple-frequency code signal. Sometimes, only a portion of the multiple-frequency code signal, e.g., a number of single-frequency code components inserted into the original audio signal, is detected in the received audio signal. However, if a sufficient quantity of code components is detected, the information signal itself may be recovered.

Other means of watermarking have been used in various forms to track multimedia over computer networks and to detect if a user is authorized to access and play the multimedia. For certain digital media, metadata is transmitted along with media signals. This metadata can be used to carry one or more identifiers that are mapped to metadata or actions. The metadata can be encoded at the time of broadcast or prior to broadcasting. Decoding of the identifier may be performed at a digital receiver. Other means of watermarking include the combination of digital watermarking with various encryption techniques known in the art.

While various encoding and watermarking techniques have been used to track and protect digital data, there have been insufficient advances in the field of cross-platform digital media monitoring. Specifically, in cases where a person's exposure to Internet digital media is monitored in addition to exposure to other forms of digital media (e.g., radio, television, etc.), conventional watermarking systems have shown themselves unable to effectively monitor and track media exposure. Furthermore, there is a need to integrate exposure to digital media across platforms where the digital media includes formats that are not traditionally subject to audio encoding. Moreover, there is a need in the art to properly "mask" such signals using environmental sounds and/or sounds native to a device that is conducting beaconing processes.

SUMMARY

Accordingly, an audio beacon system, apparatus and method is disclosed for collecting information on a panelist's exposure to media. Under a preferred embodiment, the audio beacon is configured as on-device encoding technology that is operative in a panelist's processing device (e.g., cell phone, PDA, PC) to enable the device to acoustically transmit user/panelist data for a predetermined period of time. The acoustically transmitted data is received and processed by a portable audience measurement device, such as Arbitron's Personal People Meter™ ("PPM") or a specially equipped cell phone, laptop, etc., to enable audience measurement systems to achieve higher levels of detail on panel member activity and greater association of measurement devices to their respective panelists. Additionally, the acoustic transmissions are configured to utilize environmental sounds that are advantageous in being less obtrusive to users.

Additional features and advantages of the various aspects of the present disclosure will become apparent from the following description of the preferred embodiments, which description should be taken in conjunction with the accompanying drawings.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1A is a block diagram illustrating a portion of an audio beaconing system under one exemplary embodiment;

FIG. 1B is a block diagram illustrating another portion of an audio beaconing system under the embodiment illustrated in FIG. 1A;

FIG. 2 is a tabular illustration of an audio beaconing and audio matching process under another exemplary embodiment;

FIG. 3 illustrates a block diagram of a server-side encoding process under yet another exemplary embodiment;

FIG. 4 illustrates an exemplary watermarking process for a digital media file suitable for use in the embodiment of FIGS. 1A-B;

FIG. 5 illustrates a block diagram of a client-side encoding process under yet another exemplary embodiment;

FIG. 6 illustrates an exemplary audio waveform having encoded data therein; and

FIG. 7 illustrates an exemplary device configured to select environmental sounds for audio transmission.

DETAILED DESCRIPTION

FIG. 1A is an exemplary block diagram illustrating a portion of an audio beaconing system 150 under one embodiment, where a web page 110 is provided by a page developer and published on content server 100. The web page preferably contains an embedded video player 111 and audio player 112 (that is preferably not visible), together with an application programming interface (API) 113. Other content 114 (e.g., HTML, text, etc.) is also provided on web page 110, which may or may not be coupled through API 113. API 113 is preferably embodied as a set of routines, data structures, object classes and/or protocols provided by libraries and/or operating system services in order to support the video player 111 and audio player 112. Additionally, the API 113 may be language-dependent (i.e., available only in a particular programming language) or language-independent (i.e., can be called from several programming languages, preferably via an assembly/C-level interface). Examples of suitable APIs include the Windows API, Java Platform API, OpenGL, DirectX, Simple DirectMedia Layer (SDL), YouTube API, Facebook API and iPhone API, among others.

In one preferred embodiment, API 113 is configured as a beaconing API object. Depending on the features desired, the API object may reside on an Audience Measurement (AM) server 120, so that the object may be remotely initialized, thus minimizing the object software's exposure to possible tampering and maintaining security. Alternately, the API object can reside on the content server 100, where the API object may be initialized under increased performance conditions.

When initialized, API 113 can communicate the following properties: (1) the URL of the page playing the media, (2) the URL of the media being served on the page, (3) any statically available media metadata, and (4) a timestamp. It is understood that additional properties may be communicated in API 113 as well. In one configuration of FIG. 1A, an initialization request is received by API 113 to create a code tone that is preferably unique for each website and to encode it on a small, inaudible audio stream. Alternatively, the AM server 120 could generate a pre-encoded audio clip 101, with a code tone, for each site and forward it to the content server 100 in advance.
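By way of illustration only, the following is a minimal Java sketch of the initialization properties listed above; the class and field names (BeaconInit, pageUrl, etc.) are illustrative assumptions and are not taken from the disclosure:

    // Hypothetical container for the properties API 113 communicates on
    // initialization; all names here are illustrative, not from the disclosure.
    public final class BeaconInit {
        public final String pageUrl;    // (1) URL of the page playing the media
        public final String mediaUrl;   // (2) URL of the media served on the page
        public final String metadata;   // (3) statically available media metadata
        public final long timestampMs;  // (4) timestamp

        public BeaconInit(String pageUrl, String mediaUrl,
                          String metadata, long timestampMs) {
            this.pageUrl = pageUrl;
            this.mediaUrl = mediaUrl;
            this.metadata = metadata;
            this.timestampMs = timestampMs;
        }
    }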

The encoded audio stream would then travel from content server 100 to the web page 110 holding audio player 112. In a preferred embodiment, audio player 112 may be set by the page developer as an object instance, where the visible property of player 112 is oriented as "false" or set to a one-by-one dimension in order to minimize the visual interference of the audio player with the web page. The encoded audio stream may then be played out in parallel with the media content being received from the web page 110. The encoded audio stream would preferably repeat at predetermined time periods through an on-device beacon 131 resident on a user device 130 as long as the user is on the same website. The beacon 131 would enable device 130 to acoustically transmit the encoded audio stream so that a suitably configured portable device 140 (e.g., PPM) can receive and process the encoded information. Beacon 131 could be embedded into an audio player resident on a web page being viewed inside the browser on user device 130, or may be a stand-alone application on user device 130.

A simplified example further illustrates the operation of the system 150 of FIGS. 1A-B under an alternate embodiment. User device 130 requests content (e.g., http://www.hulu.com/) from server 100. When the content is received in user device 130, PC meter software 132 collects and transmits web measurement data to Internet measurement database 141. One example of a PC meter is comScore's Media Metrix™ software; further exemplary processes of web metering may be found in U.S. Pat. No. 7,493,655, titled "Systems for and methods of placing user identification in the header of data packets usable in user demographic reporting and collecting usage data," and U.S. Pat. No. 7,260,837, titled "Systems and methods for user identification, user demographic reporting and collecting usage data usage biometrics," both of which are incorporated by reference in their entirety herein.

As web measurement data is collected by PC meter 132, beacon 131 acoustically transmits encoded audio, which is received by portable device 140. In the exemplary embodiment, the encoding for the beacon transmission may include data such as a timestamp, portable device ID, user device ID, household ID, or any similar information. In addition to the beacon data, portable device 140 additionally receives multimedia data such as television and radio transmissions 142, which may or may not be encoded, at different times. If encoded (e.g., CBET encoding), portable device 140 can forward transmissions 142 to audio matching server 160 (FIG. 1B) for decoding and matching with audio matching database 161. If transmissions 142 are not encoded, portable device 140 may employ sampling techniques for creating audio patterns or signatures, which may also be transmitted to audio matching server 160 for pattern matching using techniques known in the art.

Audio beacon server 150, shown in FIG. 1B, receives and processes/decodes beacon data from portable device 140. Under an alternate embodiment, it is possible to combine audio matching server 160 and audio beacon server 150 to collectively process both types of data. Data from audio beacon server 150 and audio matching server 160 is transmitted to Internet measurement database 141, where the web measurement data could be combined with audio beacon data and data from the audio matching server to provide a comprehensive collection of panelist media exposure data.

Under another exemplary embodiment, the video and audio players of web page 110 are configured to operate with Flash Video, which is a file format used to deliver video over the Internet using Adobe™ Flash Player. The Flash Player typically executes Shockwave Flash ("SWF") files and has support for a scripting language called ActionScript, which can be used to display Flash Video from an SWF file. Because the Flash Player runs as a browser plug-in, it is possible to embed Flash Video in web pages and view the video within a web browser. Commonly, Flash Video files contain video bit streams which are a variant of the H.263 video standard, and include support for the H.264 video standard (i.e., "MPEG-4 Part 10" or "AVC"). Audio in Flash Video ("FLV") files is usually encoded as MP3, but such files can also accommodate uncompressed audio or ADPCM-format audio.

Continuing with the embodiment, video beacons can be embedded within an action script that will be running within the video Flash Player's run-time environment on web page 110. When an action script associated with web page 110 gets loaded as a result of the access to the page, the script gets activated and triggers a "video beacon," which extracts and stores URL information on a server (e.g., content server 100), and launches the video Flash Player. By inserting an audio beacon in the same action script, the audio beacon will be triggered by the video player. Once triggered, the audio beacon may access AM server 120 to load a pre-recorded audio file containing a special embedded compatible code (e.g., CBET). This pre-recorded audio file would be utilized for beacon 131 to transmit for a given period of time (e.g., every x seconds).

As a result, the beacon 131 audio player runs as a "shadow player" in parallel to the video Flash Player. If a portable device 140 is in proximity to user device 130, portable device 140 will detect the code and report it to audio beacon server 150. Depending on the level of cooperation between the audio and video beacons, the URL information can also be deposited onto beacon server 150 along with codes that would allow an audience measurement entity to correlate and/or calibrate various measurements with demographic data.

Under the present disclosure, media data may be processed in a myriad of ways for conducting customized panel research. As an example, each user device 130 may install on-device measurement software (PC meter 132) which includes one or more web activity monitoring applications, as well as beacon software 131. It is understood that the web activity monitoring application and the beacon software may be individual applications, or may be merged into a single application.

The web activity monitoring application collects web activities data from the user device 130 (e.g., site ID, video page URL, video file URL, start and end timestamp, and any additional metadata about the video, such as site information, URL information, time, etc.) and additionally assigns a unique ID, such as a globally unique identifier or "GUID," to each device. For the beacon 131, a unique composite ID may be assigned, including a household ID ("HHID") and a unique user device ID for each device in the household (up to 10 devices for a family), as well as a portable device ID (PPMID).
Panelist demographic data may be included for each web activity on the device.
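A minimal sketch of the composite ID described above, assuming integer identifiers and a simple dashed string form purely for illustration:

    // Illustrative composite beacon identifier; types and formatting are assumed.
    public final class CompositeId {
        public final int householdId;  // HHID shared by the household
        public final int deviceId;     // unique per device (e.g., 1-10)
        public final int portableId;   // PPMID of the paired portable meter

        public CompositeId(int householdId, int deviceId, int portableId) {
            this.householdId = householdId;
            this.deviceId = deviceId;
            this.portableId = portableId;
        }

        @Override public String toString() {
            return householdId + "-" + deviceId + "-" + portableId;
        }
    }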

Continuing with the example, beacon 131 emits an audio beacon code (ABC) for each device in the household by encoding an assigned device ID number and acoustically sending it to portable device 140 to identify the device. Further details on the encoding are provided below. Portable device 140 collects the device ID and sends it to a database along with the HHID and/or PPM ID and the timestamp. Preferably, a PPMID is always mapped to an HHID in the backend; alternately, an HHID can be set within each PPMID.

The web activity monitoring and beacon applications may pass information to each other as needed. Both can upload information to a designated server for additional processing. A directory of panelists' devices is built to contain the GUID, HHID, and device ID for the panel, and the directory could be used to correlate panelist demographic data and web measurement data.

Turning to FIG. 2, a tabular illustration of an audio beaconing and audio matching process under another exemplary embodiment is provided. Specifically, the table illustrates a combination of audio beaconing and audio matching and its application to track a video on a content site, such as Hulu.com. Timeline 200 of FIG. 2 shows, in sections, a scenario where a user/panelist plays a ten-minute video on Hulu.com. Activities 201 shows actions taken in user system 150, where a video is loaded in the user device 130 and played. At the five-minute mark (301 sec.) a 15-second advertisement is served. At the conclusion of the advertisement (316 sec.), the video continues to play until its conclusion (600 sec.).

During this time, audio beacon activities 202 are illustrated, where, under one embodiment, on-device beacon 131 transmits continuous audio representing the website (Hulu.com). In addition, the beacon also transmits a timestamp, portable device ID, user device ID, household ID and/or any other data in accordance with the techniques described above. Under an alternate embodiment shown in 203, additional data may be transmitted in the beacon to include URLs and video IDs when a video is loaded and played. As the advertisement is served, an event beacon, which may include advertisement URL data, is transmitted. At the conclusion of the video, a video end beacon is transmitted to indicate the user/panelist is no longer viewing specific media.

When the video and advertisement are loaded and played, additional audio matching may occur in the portable device 140, in addition to the audio matching processes explained above in relation to FIGS. 1A-B. Referring to audio matching events 204, portable device data 205 and end-user experience 206 of FIG. 2, portable device data (e.g., demographic ID data) is overlaid along with site information (URL, video ID, etc.) when a video is loaded. When the video is played, audio signatures may be sampled periodically by portable device 140 until a content match is achieved. The audio signatures may be obtained through encoding, pattern matching, or any other suitable technique. When a match is found, portable device data is overlaid to indicate that a content match exists. Further signature samples are taken to ensure that the same content is being viewed. When an advertisement is served, the sampled signature will indicate that different content is being viewed, at which point the portable device data is overlaid in the system. When the video resumes, the audio signature indicates the same video is played, and portable device data is overlaid through the end of the video as shown in FIG. 2.

As explained above, signature sampling/audio matching allows the system 150 to identify and incorporate additional data on the users/panelists and the content being viewed. Under a typical configuration, the content provider media (e.g., Hulu, Facebook, etc.) may be sampled in advance to establish respective signatures for content, which are stored in a matching database (e.g., audio matching server 160). The portable device 140 would be equipped with audio matching software, so that, when a panelist is in the vicinity of user device 130, audio matching techniques are used to collect the signature, or "audio fingerprint," for the incoming stream. The signatures would then be matched against the signatures in the matching database to identify the content.

It is understood by those skilled in the art, however, that encoding techniques may also be employed to identify content data. Under such a configuration, content is encoded prior to transmission to include data relating to the content itself and the originating content site. Additionally, data relating to possible referral sites (e.g., Facebook, MySpace, etc.) may be included. Under one embodiment, a content management system may be arranged for content distributors to choose specific files for a corresponding referral site.

For the media data encoding, several advantageous and suitable techniques for encoding audience measurement data in audio data are disclosed in U.S. Pat. No. 5,764,763 to James M. Jensen, et al., which is assigned to the assignee of the present application, and which is incorporated by reference herein. Other appropriate encoding techniques are disclosed in U.S. Pat. No. 5,579,124 to Aijala, et al., U.S. Pat. Nos. 5,574,962, 5,581,800 and 5,787,334 to Fardeau, et al., U.S. Pat. No. 5,450,490 to Jensen, et al., and U.S. patent application Ser. No. 09/318,045 in the names of Neuhauser, et al., each of which is assigned to the assignee of the present application and all of which are incorporated by reference in their entirety herein.

Still other suitable encoding techniques are the subject of PCT Publication WO 00/04662 to Srinivasan, U.S. Pat. No. 5,319,735 to Preuss, et al., U.S. Pat. No. 6,175,627 to Petrovic, et al., U.S. Pat. No. 5,828,325 to Wolosewicz, et al., U.S. Pat. No. 6,154,484 to Lee, et al., U.S. Pat. No. 5,945,932 to Smith, et al., PCT Publication WO 99/59275 to Lu, et al., PCT Publication WO 98/26529 to Lu, et al., and PCT Publication WO 96/27264 to Lu, et al., all of which are incorporated by reference in their entirety herein.

Variations on the encoding techniques described above are also possible. Under one embodiment, the encoder may be based on a Streaming Audio Encoding System (SAES) that operates under a set of sample rates and is integrated with media transcoding automation technology, such as Telestream's FlipFactory™ software. Also, the encoder may be embodied as a console-mode application, written in a general-purpose computer programming language such as "C". Alternately, the encoder may be implemented with a Java Native Interface (JNI) to allow code running in a virtual machine to call and be called by native applications, where the JNI would include a JNI shared library for control using Java classes. The encoder payloads would be configured using specially written Java classes. Under this embodiment, the encoder would use the information-hiding abstraction of an encoder payload, which defines a single message. Under a preferred embodiment, the JNI encoder would operate using a 44.1 kHz sample rate.
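For illustration only, a minimal sketch of what a JNI binding to such a shared encoder library could look like; the library name, class name and native method signature are assumptions, not taken from the disclosure:

    // Hypothetical JNI binding for the shared encoder library; the native
    // method name and signature are illustrative only.
    public final class SaesEncoder {
        static {
            System.loadLibrary("saes_encoder"); // assumed shared library name
        }

        // Encode one payload message into 44.1 kHz PCM samples in place.
        public native void encode(short[] pcm44k1, byte[] payload);
    }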

Examples of symbol configurations and message structures are provided below. One exemplary symbol configuration uses four data symbols and one end symbol defined for a total of five symbols. Each symbol may comprise five tones, with one tone coming from each of five standard Barks. One exemplary illustration of Bark scale edges (in Hertz) would be {920, 1080, 1270, 1480, 1720, 2000}. The bins are preferably spaced on a 4×3.90625 grid in order to provide lighter processing demands, particularly in cases using decoders based on a 512-point fast Fourier transform (FFT). An exemplary bin structure is provided below:

    Symbol 0: {248, 292, 344, 400, 468}
    Symbol 1: {252, 296, 348, 404, 472}
    Symbol 2: {256, 300, 352, 408, 476}
    Symbol 3: {260, 304, 356, 412, 480}
    End Marker Symbol: {264, 308, 360, 416, 484}
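As a check on the layout above, the sketch below assumes that each tone frequency is the bin index multiplied by the 3.90625 Hz grid spacing; under that assumption, symbol 0's tones fall at 968.75, 1140.63, 1343.75, 1562.50 and 1828.13 Hz, one inside each of the five Bark bands listed above:

    // Assumes tone frequency = bin index * 3.90625 Hz (the grid spacing above).
    public final class BinLayout {
        static final double GRID_HZ = 3.90625;
        static final int[][] SYMBOL_BINS = {
            {248, 292, 344, 400, 468},  // symbol 0
            {252, 296, 348, 404, 472},  // symbol 1
            {256, 300, 352, 408, 476},  // symbol 2
            {260, 304, 356, 412, 480},  // symbol 3
            {264, 308, 360, 416, 484},  // end marker
        };

        public static void main(String[] args) {
            for (int bin : SYMBOL_BINS[0]) {
                System.out.printf("bin %d -> %.2f Hz%n", bin, bin * GRID_HZ);
            }
        }
    }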

Regarding message structure, an exemplary message would comprise 20 symbols, each being 400 milliseconds in duration, for a total duration of 8 seconds. Under this embodiment, the first 3 symbols could be designated as match/check criteria symbols, which could be the simple sum of the data symbols or could be derived from an error correction or cyclical redundancy check algorithm. The following 16 symbols would then be designated as data symbols, leaving the last symbol as an end symbol used for a marker. Under this configuration, the total number of possible data messages would be 4¹⁶, or 4,294,967,296.
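A sketch of this 20-symbol layout follows. Deriving the three check symbols from the base-4 digits of the data-symbol sum is an assumption made for illustration; the disclosure only says the checks "could be the simple sum" or derive from an error-correction or CRC algorithm:

    // Sketch: 3 check symbols + 16 data symbols + 1 end marker = 20 symbols.
    public final class MessageLayout {
        public static int[] buildMessage(int[] data16) {
            if (data16.length != 16) throw new IllegalArgumentException();
            int sum = 0;
            for (int s : data16) {
                if (s < 0 || s > 3) throw new IllegalArgumentException();
                sum += s;                 // 0..48 fits in three base-4 digits
            }
            int[] msg = new int[20];      // 20 symbols x 400 ms = 8 seconds
            msg[0] = (sum >> 4) & 3;      // check symbols: base-4 digits of sum
            msg[1] = (sum >> 2) & 3;      // (an assumed "simple sum" variant)
            msg[2] = sum & 3;
            System.arraycopy(data16, 0, msg, 3, 16);
            msg[19] = 4;                  // index of the end-marker symbol
            return msg;
        }
    }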

Variations in the algorithmic process for encoding are possible as well under the present disclosure. For example, a core sampling rate of 5.5125 kHz may be used instead of 8 kHz to allow down-sampling from 44.1 kHz to be efficiently performed (44.1 kHz is exactly eight times 5.5125 kHz, so an integer decimation factor may be used) without a pre-filter (to eliminate aliasing components) followed by a conversion filter to 48 kHz. Such a configuration should have no effect on code tone grid spacing, since the output frequency generation is independent of the core sampling rate. Additionally, this configuration would limit the top end of the usable frequency span to about 2 kHz (as opposed to 3 kHz under conventional techniques), since frequency space should be left for filters with practical numbers of taps.

Under one embodiment, a 16-point overlap of a 256-point large FFT is used, resulting in amplitude updates every 2.9 milliseconds for encoding (16 samples at 5.5125 kHz), instead of every 2 milliseconds for standard CBET techniques. Accordingly, fewer large FFTs are calculated under a tighter bin resolution of 21.5 Hz (5,512.5/256) instead of 31.25 Hz.

The psychoacoustic model calculations used for the encoding algorithm under the present disclosure may vary from traditional techniques as well. In one embodiment, bin spans of the clumps may be set by Bark boundaries instead of being wholly based on Critical Bandwidth criteria. By using Bark boundaries, a specific bin will not contribute to the encoding power level of multiple clumps, which provides less coupling between code amplitudes of adjacent clumps. When producing Equivalent Large FFTs, a comparison may be made of the most recent 16-point Small FFT results to a history of squared sums to simplify calculations.

For noise power computation, the encoding algorithm under the present disclosure would preferably use 3 bin values over a clump: the minimum bin power (MIN), the maximum bin power (MAX), and the average bin power (AVG). Under this arrangement, the bin values could be modeled as follows:

    IF (MAX > (2 * MIN))
        PWR = MIN
    ELSE
        PWR = AVG

Here, PWR may be scaled by a predetermined factor to produce masking energy.
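A direct Java transcription of the rule above; the scale factor is left as a parameter since the disclosure only says PWR "may be scaled by a predetermined factor":

    // Per-clump noise power model from the rule above; scale is assumed.
    public final class NoisePower {
        public static double maskingEnergy(double[] clumpBinPowers, double scale) {
            double min = Double.POSITIVE_INFINITY, max = 0.0, sum = 0.0;
            for (double p : clumpBinPowers) {
                min = Math.min(min, p);
                max = Math.max(max, p);
                sum += p;
            }
            double avg = sum / clumpBinPowers.length;
            double pwr = (max > 2.0 * min) ? min : avg;  // the rule above
            return pwr * scale;                          // masking energy
        }
    }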

A similar algorithm could also be used to create a 48 kHz native encoder using a core sample rate of 6 kHz and a large FFT bin resolution of 23.4375 Hz calculated every 2.67 milliseconds. Such a configuration would differ in detection efficiency and inaudibility from the embodiments described above, but it is anticipated that the differences would be slight.

With regard to decoding, an exemplary configuration would include a software decoder based on a JNI shared library, which performs calculations up through the bin signal-to-noise ratios. Such a configuration would allow an external application to define the symbols and perform pattern matching. Such steps would be handled in a Java environment using an information-hiding extraction of a decoder payload, where decoder payloads are created using specially written Java classes.

Turning to FIG. 3, an exemplary server-side encoding embodiment is illustrated. In this example, content server 100 has content 320, which includes a media file 302 configured to be requested and played on media player 301 residing on user device 130. When media file 302 is initialized, audio is extracted from the media file and, if the audio is encoded (e.g., MP3 audio), subjected to audio decoding in 304 to produce raw audio 305. To encode the audio for beaconing, device ID, HHID and/or PPMID data is provided to first encoding 306, which encodes the data into the raw audio 305 using any suitable technique described above.

After the first encoding, the audio data is then subjected to a second encoding to transform the audio into a suitable format (e.g., MP3) to produce fully encoded audio 308, which is subsequently transmitted to media player 301 and beaconed to portable device 140. Alternately, encoded audio 308 may be produced in advance and stored as part of media file 302. During the encoding process illustrated in FIG. 3, care must be taken to account for processing delays to ensure that the encoded audio is properly synchronized with any video content in media file 302.

The server-side encoding may be implemented under a number of different options. A first option would be to implement a pre-encoded beacon, where the encoder (306) would be configured to perform real-time encoding of the audio beacon based on the content being served to the users/panelists. The user device would be equipped with a software decoder as described above, which is invoked when media is played. The pre-encoded beacon would establish a message link which could be used, along with an identifier from the capturing portable device 140, in order to assign credit. The encoding shared library would preferably be resident at the content site (100) as part of the encoding engine. Such a configuration would allow the transcoding and encoding to be fit into the content site workflow.

Another option for server-side encoding could include a pre-encoded data load, where the audio is encoded with a message that is based on the metadata or the assigned URL. This establishes a message link which can be used, along with an identifier from the capturing portable device 140, in order to assign credit. The encoding shared library is preferably resident at the content site (100) as part of the encoding engine. Again, this configuration would allow the transcoding and encoding to be fit into the content site workflow.

Yet another option for server-side encoding could include "on-the-fly" encoding. If a video is being streamed to a panelist, encoding may be inserted in the stream along with a transcoding object. The encoding may be used to encode the audio with a simple one-of-N beacon, and the panelist user device 130 would contain software decoding which is invoked when the video is played. This also establishes a message link which can be used, along with an identifier from the capturing portable device 140, in order to assign credit. The encoding shared library is preferably resident at the content site (100) as part of the encoding engine. Under a preferred embodiment, an ActionScript would invoke the decoding along with a suitable transcoding object.

FIG. 4 illustrates one embodiment for encoding media under a Flash Video platform 410, where the content is preferably encoded in advance. As raw audio from a video file or other source 400 is received, the audio is subjected to watermark encoding 401, which may include the techniques described above. Once encoded, the audio is formatted as a Flash file using Adobe tools 402 such as FLV Creator and SWF Compiler. Once compiled, the file is further formatted using Flash-supported codecs (e.g., H.264, VP6, MPEG-4 ASP, Sorenson H.263) and compression 403 to produce a watermarked A/V stream or file 404.

FIG. 5 provides another alternate embodiment that illustrates client-side encoding and processing. In this example, user device 130 requests media data. In response to the request, a media file 531 residing on content server 100 is subsequently streamed to the device's browser 520 arranged on the user's workspace 510. Media player 521 plays the streamed content and produces raw audio 511. A client-side ActionScript notifies browser 520 and encoder 522 to capture the raw audio on the device's sound mixer, or microphone (not shown), and to encode data using a suitable encoding technique described above. The encoding constructs the data for an independent audio beacon using the captured audio and other data (e.g., device ID, HHID, etc.), where portable device 140 picks up the beacon and forwards the data to an appropriate server for further processing and panel data evaluation.

Similar to the server-side embodiment disclosed in FIG. 3, care must be taken in the software to account for processing delays in audio pickup and (CBET) encoding of the audio beacon. Preferably, synchronization between audio beacon playback and audio playback (specifically FLV playback) should be accounted for. In alternate embodiments, communication between media player 521 and encoder 522 could be through ActionScript interface APIs, such as "ExternalInterface," an application programming interface that enables straightforward communication between ActionScript and a Flash Player container, for example, an HTML page with JavaScript or a desktop application with Flash Player embedded, along with encoder application 522. To get information on the container application, an ActionScript interface could be used to call code in the container application, including a web page or desktop application. Additionally, ActionScript code could be called from code in the container application. Also, a proxy could be created to simplify calling ActionScript code from the container application.

For the panel-side encoding, a beacon embodiment may be enabled by having the encoding message be one from a relatively small set (e.g., 1 of 12), where each user device 130 is assigned a different message. When portable device 140 detects the encoded message, it identifies the user device 130. Alternately, the encoding message may be a hash of the site and/or URL information gleaned from the metadata. When a panelist portable device 140 detects and reports the encoded message, a reverse hash can be used to identify the site, where the hash could be resolved on one or more remote servers (e.g., server 160).
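A sketch of the hashed-message variant follows, assuming the "reverse hash" is realized as a server-side lookup table from hash values back to site URLs; the hash function and the message-space size are assumptions made purely for illustration:

    // Illustrative only: maps a site URL into a small beacon message space and
    // resolves it back via a server-side table (the "reverse hash").
    import java.util.HashMap;
    import java.util.Map;

    public final class SiteHash {
        private static final int MESSAGE_SPACE = 1 << 16;  // assumed payload size
        private final Map<Integer, String> reverse = new HashMap<>();

        public int register(String siteUrl) {              // run on the AM server
            int code = Math.floorMod(siteUrl.hashCode(), MESSAGE_SPACE);
            reverse.put(code, siteUrl);                    // enables reverse lookup
            return code;                                   // value carried in beacon
        }

        public String resolve(int code) {                  // run when PPM reports
            return reverse.get(code);
        }
    }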

In addition to the encoding techniques described above in connection with media content, a simplified beaconing configuration may be arranged where the beacon operates as a complement to media data, operates independently of the media data, or provides a beacon where no specially encoded data exists. Referred to herein as a "twinkle," the simplified beaconing comprises a constant-amplitude acoustic signal or tone that is generated on user device 130. This acoustic tone is then automatically encoded, preferably with identification data (e.g., device ID, HHID and/or PPM ID) and a timestamp. The encoded acoustic tone would then be forwarded to portable device 140 for processing and identification.

The acoustic tone used for the twinkle is preferably embodied as a pre-recorded constant-amplitude tone that is transmitted at predetermined times. The encoding is preferably performed using any of the techniques described above. Under one embodiment, the simplified beaconing process would only forward the encoded, pre-recorded tone, independently of any audio data being received. Thus, referring back to FIG. 1, it is possible that user device 130 receives only other content 114 from content server 100 in the form of text-based HTML. As PC meter 132 records browsing information, the encoded tone is transmitted to portable device 140, where, after further processing (see FIG. 1B), the user identification data is merged into Internet measurement database 141. It is understood that user device 130 may also receive audio data (encoded or unencoded) separately and in addition to other content. While the techniques described above would encode and forward audio data received, the simplified beacon ("twinkle") would also transmit ID information to portable device 140, which, in conjunction with PC meter 132, would subsequently merge panelist data into a common database.

In another exemplary embodiment, FIG. 6 illustrates audio signal 600 represented as a spectrum of audio 610 over a period of time (e.g., 0.25 seconds), where the audio energy varies with frequency between 1200 and 2200 Hz. Overlaid in black are discrete, narrowband code tones 602 (e.g., CBET) opportunistically inserted into the audio using the principles of psychoacoustic masking. For encoded tones, one of which is illustrated as 602 in FIG. 6, the energy of the inserted code tone varies with the level of the audio, so quieter portions of the frequency spectrum (e.g., 604) receive little encoding energy compared to louder portions (e.g., 605), which get proportionally more.

In contrast, the simplified encoding ("twinkle") 603 is encoded and inserted at constant levels across the frequency spectrum, where the levels are independent of the audio levels. This allows the simplified encoding to be pre-recorded, easily generated and capable of being reused across various and/or different content. The simplified encoding could have the same message structure as the CBET encoding described above, utilizing a 10-tone symbol set. Alternately, other message structures are possible as well. As mentioned above, the twinkle may be transmitted automatically at regular intervals. Alternately, the twinkle may be invoked by an ActionScript. If two players are utilized (i.e., one for the media and one for the twinkle), the ActionScript could relay a beacon for the media from user device 130 to portable device 140, while simultaneously requesting a second (preferably invisible) Flash Player in the user device 130 to transmit the twinkle to portable device 140. Under a preferred embodiment, the ActionScript should invoke both players at a common volume setting.
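For illustration, a sketch that synthesizes one constant-amplitude twinkle symbol as a sum of sine tones; since the 10-tone symbol set mentioned above is not detailed in this disclosure, the tone frequencies are left as a parameter (e.g., the five symbol-0 frequencies derived earlier could be passed in), and 16-bit PCM output is an assumption:

    // Constant-amplitude multi-tone "twinkle" symbol; tone set is assumed.
    public final class Twinkle {
        public static short[] symbol(double[] toneHz, double sampleRate,
                                     double durationSec, double amplitude) {
            int n = (int) (sampleRate * durationSec);
            short[] pcm = new short[n];
            for (int i = 0; i < n; i++) {
                double t = i / sampleRate, sum = 0.0;
                for (double f : toneHz) {
                    sum += Math.sin(2.0 * Math.PI * f * t);
                }
                // Constant level, independent of any program audio (unlike CBET).
                pcm[i] = (short) (amplitude / toneHz.length * sum * Short.MAX_VALUE);
            }
            return pcm;
        }
    }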

In certain embodiments, it is advantageous to configure the audio beacon or "twinkle" on a device so that it is not intrusive and/or distracting to the user. Additionally, the beacon may be arranged to have audio characteristics that make it easier and more robust to encode. For example, simulated environmental sounds, such as power supply fans, air vent exhaust, crowd/audience noise, ocean waves and such, may be used as the audio beacon sound in which encoded messages may be inserted. In another example, it is known that many computing devices, such as personal computers, phones, tablets and laptops, contain a pre-stored library of audio that is used to alert or notify users. By creating and storing sounds advantageously suited to encode data in the on-device sound library, a more robust beacon may be utilized. Furthermore, since the sound used for the beacon is known a priori, the encoding on the device may be simplified, since the time, frequency, masking and other encoding processes will be known.

Turning to FIG. 7, an exemplary embodiment is illustrated where user device 130 is configured with a control panel 700 that allows the device to control various aspects, including sound 701, which includes a resident sound library 720. In a preferred embodiment, a script, operating alone or as part of meter software (see FIG. 1A, ref. 132), may control aspects of sound library 720, such that specific sounds are used for audio beacon encoding. Of course, it is possible for a user to manually change these features, either directly through user device 130 or through a remote connection.

Sound library 720 comprises audio sounds (704-707) that are associated with one or more software applications 700-701 and/or events 702-703. "Events," for the purposes of FIG. 7, include notifications (e.g., receipt of email, social networking software status update, etc.) and alerts (e.g., start-up, shut-down, application error, etc.) that are made pursuant to the device's operating system and associated software running on the device. In this example, a first application (APP1) 700 and a second application (APP2) 701 are associated with respective sounds 704-705 that may be triggered when the application is activated, or at predetermined times determined by the specific application. Applications 700-701 may include metering software 132 discussed above in connection with FIG. 1A.

In one embodiment, each sound 704, 705 is associated with a respective application 700, 701. The sounds (704-705) are used for encoding messages to form respective beacons (708-709). As data for the audio beacon is being collected on user device 130 (e.g., user device ID, web hash, etc.), the sound (e.g., 704) is copied, data is encoded into it and the encoded copy is stored in a buffer or other suitable memory. As further data for an audio beacon is collected, another copy of the original sound (704) is made, and the further data is encoded and stored into the buffer/memory. This process repeats for as long as necessary to form a string of encoded sounds, as sketched below. The sounds may be arranged sequentially or in other suitable formats. When an application 700 triggers a sound, a first beacon 708 is audibly transmitted. When the next trigger occurs, the next beacon is audibly transmitted, and so on, until the buffer/memory is empty or a predetermined amount of time has expired. In another embodiment, one application can control a plurality of sounds (704-705) and produce a plurality of encoded beacons (708-709).
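A minimal sketch of the copy-encode-buffer cycle just described; the encode() stub stands in for whichever encoding technique described above is used, and the class and method names are illustrative:

    // Illustrative beacon buffer: each collected payload is encoded into a
    // fresh copy of the application's stored sound and queued for playback.
    import java.util.ArrayDeque;
    import java.util.Queue;

    public final class BeaconBuffer {
        private final short[] originalSound;               // e.g., sound 704
        private final Queue<short[]> pending = new ArrayDeque<>();

        public BeaconBuffer(short[] originalSound) {
            this.originalSound = originalSound;
        }

        public void collect(byte[] payload) {
            short[] copy = originalSound.clone();          // copy the stored sound
            encode(copy, payload);                         // encode data into it
            pending.add(copy);                             // store in buffer
        }

        public short[] onTrigger() {                       // application trigger
            return pending.poll();                         // next beacon, or null
        }

        private static void encode(short[] pcm, byte[] payload) {
            /* any suitable encoding technique described above */
        }
    }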

Continuing with FIG. 7, events 702-703 may be associated with respective sounds 706-707, similar to applications 700-701 discussed above, to produce beacons 710-711. This configuration may be particularly advantageous for beaconing information when a user performs an act on the device, such as opening/closing a browser window, opening/closing a tab on a browser, opening/closing an application, etc. As data for the audio beacon is being collected (e.g., user device ID, web hash, etc.), the sound (e.g., 706) is copied, data is encoded into it and the encoded copy is stored in a buffer or other suitable memory. As further data for an audio beacon is collected, another copy of the original sound (706) is made, and the further data is encoded and stored into the buffer/memory. This process repeats for as long as necessary to form a string of encoded sounds. Again, the sounds may be arranged sequentially or in other suitable formats. When an event (702) is detected, a first beacon 710 is audibly transmitted. When the next event occurs, the next beacon is audibly transmitted, and so on, until the buffer/memory is empty or a predetermined amount of time has expired.

As mentioned previously, the sounds 704-707 are preferably predetermined and may simulate an environmental sound so as not to be intrusive or distracting to the user. Additionally, the sound may be selected to contain audio characteristics (e.g., having high masking levels in critical frequency bands) that make it conducive to robust audio encoding. By using a predetermined sound for the audio beacon encoding, designers can have more flexibility in audibly beaconing data. In addition to audio characteristics, the predetermined sounds may have different lengths as well. In an example where an application (700) controls multiple sounds, the sounds may be the same instance of one sound but with different lengths (e.g., 5 sec., 10 sec., etc.). In cases where a device's (130) volume is lower, the application may default to a longer sound to increase the probability of the beacon code being detected. If the volume increases, the device 130 can switch to a shorter sound. This configuration has the added benefit of ensuring that users maintain a sufficient volume on their device to avoid longer (and possibly more intrusive) beacons.
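A sketch of this volume-dependent selection, assuming two stored variants of the same sound; the 0.3 threshold and the two-variant layout are illustrative assumptions:

    // Lower device volume -> longer stored sound, per the passage above.
    public final class BeaconSoundSelector {
        public static short[] select(double volume,        // 0.0..1.0 setting
                                     short[] shortSound,   // e.g., 5 sec. variant
                                     short[] longSound) {  // e.g., 10 sec. variant
            final double LOW_VOLUME_THRESHOLD = 0.3;       // assumed threshold
            return (volume < LOW_VOLUME_THRESHOLD) ? longSound : shortSound;
        }
    }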

Various embodiments disclosed herein provide devices, systems and methods for performing various functions using an audience measurement system that includes audio beaconing. Although specific embodiments are described herein, those skilled in the art recognize that other embodiments may be substituted for the specific embodiments shown to achieve the same purpose. As an example, although terms like "portable" are used to describe different components, it is understood that other, fixed devices may perform the same or equivalent functions. Also, while specific communication protocols are mentioned in this document, one skilled in the art would appreciate that other protocols may be used or substituted. This application covers any adaptations or variations of the present invention. Therefore, the present invention is limited only by the claims and all available equivalents.

1-18. (Canceled)
19. A computing device comprising: memory including computer readable instructions; and a processor to execute the instructions to at least: select, based on a volume setting of the computing device, one of a plurality of sound signals stored at the computing device, the plurality of sound signals having different lengths; encode a digital message in the selected one of the plurality of sound signals to form an encoded sound signal; and cause the encoded sound signal to be audibly output from the computing device in response to a trigger.
20. The computing device of claim 19, wherein the processor is further to collect data to include in the digital message, the data to include at least one of an identifier associated with the computing device or a hash of data associated with a web site accessed by the computing device.
21. The computing device of claim 19, wherein, to select the one of the plurality of sound signals, the processor is to: select a first one of the plurality of sound signals having a first length when the volume setting of the computing device is a first volume setting; and select a second one of the plurality of sound signals having a second length that is longer than the first length when the volume setting of the computing device is a second volume setting that is lower than the first volume setting.
22. The computing device of claim 19, wherein the processor is further to: store the encoded sound signal in a buffer including a plurality of encoded sound signals encoded with respective digital messages; and cause respective ones of the plurality of encoded sound signals included in the buffer to be audibly output sequentially from the computing device in response to a sequence of triggers.
23. The computing device of claim 22, wherein the processor is to continue to cause respective ones of the plurality of encoded sound signals included in the buffer to be audibly output sequentially from the computing device in response to the sequence of triggers until at least one of the buffer is empty or an amount of time has expired.
24. The computing device of claim 19, wherein the plurality of sound signals is associated with an application to execute on the computing device, and the application is to trigger the processor to cause the encoded sound signal to be audibly output from the computing device.
25. The computing device of claim 19, wherein the plurality of sound signals is associated with an operating system event of the computing device, and occurrence of the operating system event is to trigger the processor to cause the encoded sound signal to be audibly output from the computing device.
26. A computer readable storage device comprising computer readable instructions that, when executed by a processor of a computing device, cause the processor to at least: select, based on a volume setting of the computing device, one of a plurality of sound signals stored at the computing device, the plurality of sound signals having different lengths; encode a digital message in the selected one of the plurality of sound signals to form an encoded sound signal; and cause the encoded sound signal to be audibly output from the computing device in response to a trigger.
27. The storage device of claim 26, wherein the instructions further cause the processor to collect data to include in the digital message, the data to include at least one of an identifier associated with the computing device or a hash of data associated with a web site accessed by the computing device.
28. The storage device of claim 26, wherein, to select the one of the plurality of sound signals, the instructions cause the processor to: select a first one of the plurality of sound signals having a first length when the volume setting of the computing device is a first volume setting; and select a second one of the plurality of sound signals having a second length that is longer than the first length when the volume setting of the computing device is a second volume setting that is lower than the first volume setting.
29. The storage device of claim 26, wherein the instructions further cause the processor to: store the encoded sound signal in a buffer including a plurality of encoded sound signals encoded with respective digital messages; and cause respective ones of the plurality of encoded sound signals included in the buffer to be audibly output sequentially from the computing device in response to a sequence of triggers.
30. The storage device of claim 29, wherein the instructions cause the processor to continue to cause respective ones of the plurality of encoded sound signals included in the buffer to be audibly output sequentially from the computing device in response to the sequence of triggers until at least one of the buffer is empty or an amount of time has expired.
31. The storage device of claim 26, wherein the plurality of sound signals is associated with an application to execute on the computing device, and the application is to trigger the processor to cause the encoded sound signal to be audibly output from the computing device.
32. The storage device of claim 26, wherein the plurality of sound signals is associated with an operating system event of the computing device, and occurrence of the operating system event is to trigger the processor to cause the encoded sound signal to be audibly output from the computing device.
33. A method for a computing device, the method comprising: selecting, by executing an instruction with a processor of the computing device and based on a volume setting of the computing device, one of a plurality of sound signals stored at the computing device, the plurality of sound signals having different lengths; encoding, by executing an instruction with the processor, a digital message in the selected one of the plurality of sound signals to form an encoded sound signal; and audibly outputting the encoded sound signal from the computing device in response to a trigger.
34. The method of claim 33, further including collecting data to include in the digital message, the data including at least one of an identifier associated with the computing device or a hash of data associated with a web site accessed by the computing device.
35. The method of claim 33, wherein the selecting of the one of the plurality of sound signals includes: selecting a first one of the plurality of sound signals having a first length when the volume setting of the computing device is a first volume setting; and selecting a second one of the plurality of sound signals having a second length that is longer than the first length when the volume setting of the computing device is a second volume setting that is lower than the first volume setting.
36. The method of claim 33, further including: storing the encoded sound signal in a buffer including a plurality of encoded sound signals encoded with respective digital messages; and audibly outputting respective ones of the plurality of encoded sound signals from the buffer sequentially in response to a sequence of triggers until at least one of the buffer is empty or an amount of time has expired.
37. The method of claim 33, wherein the plurality of sound signals is associated with an application to execute on the computing device, and the application is to trigger the encoded sound signal to be audibly output from the computing device.
38. The method of claim 33, wherein the plurality of sound signals is associated with an operating system event of the computing device, and occurrence of the operating system event is to trigger the processor to cause the encoded sound signal to be audibly output from the computing device.